Auditlog: Add django-pghistory as audit log (optional for now)#13126
Auditlog: Add django-pghistory as audit log (optional for now)#13126valentijnscholten wants to merge 19 commits intoDefectDojo:devfrom
django-pghistory as audit log (optional for now)#13126Conversation
1933759 to
5058de8
Compare
cd137fb to
b69c2fe
Compare
🟡 Please give this pull request extra attention during review.This pull request introduces multiple security and reliability regressions: it allows IP spoofing in audit logs via direct use of X-Forwarded-For, uses icontains filters that can force full-table scans causing DB DoS, and exposes audit information that permits username enumeration and disclosure of sensitive fields. It also adds operational risks and fragile code—silent-destructive --force log clearing, a hardcoded excluded-fields map that can leak secrets, broad exception handling that can silently disable audit triggers, an initializer that permits starting with missing migrations, and template changes that enable stored/reflected XSS by rendering unvalidated/unsafe audit data and URLs.
🟡 Potential Cross-Site Scripting in
|
| Vulnerability | Potential Cross-Site Scripting |
|---|---|
| Description | File: dojo/templates/dojo/action_history.html -- The patch directly renders user-controllable values into the HTML response in multiple locations that can lead to XSS: 1) <a href="{{ h.url }}" ... title="{{ h.url }}">: The template inserts h.url directly into the href attribute and into the title attribute. Even though Django templates escape HTML special characters, an attacker-controlled value such as a "javascript:alert(1)" URI will remain a valid URI in the href and will execute when clicked. Using user-provided URLs as href targets without validation allows injection of javascript: or other harmful schemes and is a reflected/DOM XSS vector. 2) {{ h.pgh_datapprint }} inside : The template outputs h.pgh_data after applying a pprint filter. If the pprint filter returns HTML or marks its output safe (or does not perform proper HTML-escaping), any HTML/JavaScript contained in stored pgh_data will be rendered by the browser. Even inside a |
django-DefectDojo/dojo/templates/dojo/action_history.html
Lines 4 to 158 in d89dd5e
Information Disclosure: User Enumeration via Audit Log Fallback Logic in dojo/engagement/signals.py
| Vulnerability | Information Disclosure: User Enumeration via Audit Log Fallback Logic |
|---|---|
| Description | The action_history view, accessible to users with Product_View or Engagement_View permissions, displays audit events. When an object (like an Engagement, Product, or Finding Group) is deleted, the system attempts to identify the deleting user from pghistory or django-auditlog. If a user is found, their username is explicitly included in the audit description displayed in the action_history view. This allows low-privileged users to enumerate the usernames of users (including potentially higher-privileged ones) who performed deletion actions on objects they have view access to. |
django-DefectDojo/dojo/engagement/signals.py
Lines 47 to 84 in d89dd5e
IP Address Spoofing in Audit Logs in dojo/middleware.py
| Vulnerability | IP Address Spoofing in Audit Logs |
|---|---|
| Description | The PgHistoryMiddleware in dojo/middleware.py directly uses the HTTP_X_FORWARDED_FOR header to determine the client's IP address for audit logging. This header is easily spoofable by an attacker. The code explicitly takes the first IP address from the comma-separated list in HTTP_X_FORWARDED_FOR. Without proper configuration of USE_X_FORWARDED_HOST and FORWARED_HOST_ALLOWED_IPS in Django settings, or a trusted proxy that sanitizes this header, an attacker can set this header to an arbitrary IP address, thereby falsifying their origin in the audit logs. This undermines the integrity and reliability of the audit trail, as audit logs could be manipulated to obscure an attacker's true source IP. |
django-DefectDojo/dojo/middleware.py
Lines 192 to 217 in d89dd5e
Denial-of-Service via inefficient icontains lookup on pghistory_context in dojo/filters.py
| Vulnerability | Denial-of-Service via inefficient icontains lookup on pghistory_context |
|---|---|
| Description | The PgHistoryFilter in dojo/filters.py uses icontains lookups for remote_addr and pgh_diff. The icontains lookup in Django translates to a LIKE '%value%' SQL query in PostgreSQL. PostgreSQL's standard B-tree indexes, even expression indexes created on (metadata->>'remote_addr') as seen in dojo/db_migrations/0244_pghistory_indices.py, are not effectively utilized for LIKE '%value%' queries. B-tree indexes are optimized for prefix matches (LIKE 'value%') or exact matches, but not for arbitrary substring searches. Similarly, the GIN index created on the entire metadata JSONB field (ON "pghistory_context" USING GIN ("metadata")) is primarily designed for JSONB-specific operators (e.g., @>, ?, ?, ?&) which check for key existence or containment of specific JSON structures. It does not efficiently accelerate LIKE '%value%' queries on text values extracted from within the JSONB document. Therefore, when a user applies a filter using icontains on remote_addr or pgh_diff, the database will likely perform a full table scan on the pghistory_context table (or the relevant event table, which joins with pghistory_context to get remote_addr and pgh_diff from metadata). If there is a large volume of audit log entries, this full table scan can consume significant CPU and I/O resources, leading to a denial-of-service (DoS) for the database and the application. |
django-DefectDojo/dojo/filters.py
Lines 3488 to 3554 in d89dd5e
Destructive Operation with --force option bypasses critical confirmation in dojo/management/commands/pghistory_clear.py
| Vulnerability | Destructive Operation with --force option bypasses critical confirmation |
|---|---|
| Description | The pghistory_clear.py management command includes a --force flag that, when used, bypasses the interactive confirmation prompt before executing a destructive operation (truncating or dropping pghistory audit log tables). This means that an attacker who has gained the ability to execute Django management commands could use this flag to silently and irreversibly delete audit log data, potentially covering their tracks without user interaction. |
django-DefectDojo/dojo/management/commands/pghistory_clear.py
Lines 1 to 206 in d89dd5e
Information Disclosure via Raw Audit Data in dojo/templates/dojo/action_history.html
| Vulnerability | Information Disclosure via Raw Audit Data |
|---|---|
| Description | The action_history.html template directly renders raw pgh_data and pgh_context fields from django-pghistory events. The pgh_data field contains a snapshot of all model fields (unless explicitly excluded), and pgh_context contains request details like remote_addr and url. This can lead to the unintended exposure of sensitive information (e.g., internal notes, PII like IP addresses, or sensitive GET parameters in URLs) to authorized users who access the audit history, even if they don't have direct access to those specific sensitive fields in the live object. |
django-DefectDojo/dojo/templates/dojo/action_history.html
Lines 4 to 158 in d89dd5e
Application Starts with Incomplete Database Schema in docker/entrypoint-initializer.sh
| Vulnerability | Application Starts with Incomplete Database Schema |
|---|---|
| Description | The entrypoint-initializer.sh script has been modified to allow the application to continue starting even if Django database migrations are detected as missing. This change transforms a critical failure condition into a warning. Running an application with an inconsistent database schema can lead to unpredictable behavior, data integrity issues, and potential security vulnerabilities if critical security-related model changes (e.g., new access control fields, data validation constraints) are not applied to the database. While the change aims to improve resilience, it introduces a risk of operating in an undefined state where security controls might not be fully enforced. |
django-DefectDojo/docker/entrypoint-initializer.sh
Lines 121 to 145 in d89dd5e
Potential for Information Disclosure of Sensitive Data via Audit Logs in dojo/management/commands/pghistory_backfill.py
| Vulnerability | Potential for Information Disclosure of Sensitive Data via Audit Logs |
|---|---|
| Description | The pghistory_backfill.py command uses a manually maintained, hardcoded dictionary excluded_fields_map to prevent sensitive fields from being written into the audit history during a backfill. While the currently identified sensitive fields (password, header_name, header_value) are correctly excluded in both the backfill script and the live django-pghistory configuration, this manual approach is fragile. If a new sensitive field is added to a model in the future, a developer might forget to update this exclusion list in pghistory_backfill.py (and potentially dojo/auditlog.py), leading to sensitive data being inadvertently written to the audit log tables. These logs may have different access controls and longer retention periods, increasing the risk of information disclosure. |
django-DefectDojo/dojo/management/commands/pghistory_backfill.py
Lines 1 to 265 in d89dd5e
Incomplete Error Handling for Audit Trigger Failures in dojo/auditlog.py
| Vulnerability | Incomplete Error Handling for Audit Trigger Failures |
|---|---|
| Description | The enable_django_pghistory, disable_django_pghistory, and configure_pghistory_triggers functions in dojo/auditlog.py use broad try...except Exception blocks when calling pgtrigger commands. If pgtrigger fails to enable or disable its database triggers (e.g., due to database connection issues, permissions, or misconfiguration), the application will log a warning but continue to start and operate. This leads to a silent failure of the django-pghistory audit logging system, resulting in a critical loss of security visibility as audit events will not be recorded, despite the system appearing to function normally. |
django-DefectDojo/dojo/auditlog.py
Lines 1 to 388 in d89dd5e
All finding details can be found in the DryRun Security Dashboard.
|
Superseded by #13169 on an upstream branch. |
Summary
As part of evaluating and improving import/reimport performance we have selected django-pghistory as the new auditlog for Defect Dojo.
It is entirely based on Postgres triggers to maintain a history/audit log. This makes it run faster and offers new features such as reverting a record to an old version, adding richer context to processes and events, finding events on related records, etc.
This PR doesn't introduce new features yet, it just adds
django-pghistoryas replacement fordjango-auditlog.EDIT:
This PR allows still for using
dang-auditlogas auditlog. Initially I planned to let the default todjango-auditlogin this PR, but then we would not be able to run the unit tests withdjango-pghistoryand vice versa. So I set the default todjango-pghistoryin this PR for now. We'll have to decide on a final approach before merging. The more I think of it, the more I think we should just go for a "big bang" release as catering for both auditlogs in the same codebase is just expensive to maintain, even for a short period. The changes in this PR are in a branch in the main repo, so not in my fork.In a future release
dhango-pghistorywill be the auditlog implementation anddjango-auditlogwill no longer be available for auditlogging.HowTo
To switch an instance over to
django-pghistory, some steps are needed:DD_AUDITLOG_TYPEtodjango-pghistorydocker compose exec -it uwsgi ./manage.py pghistory backfillto load the initial snapshot of data intodjango-pghistory. (this is NOT a data migration fromdjango-auditlog, but an initial snapshot from the current data in Defect Dojo).In the future both will be part of a new release which will perform the backfill automatically.
Once switched over, you cannot switch back (unless you know what you're doing).
The action history pages will always display the data from both
django-pghistoryanddjango-auditlog. As these data formats are completely different, there are two tables on the action history page.Some notes about the implementation
django-pghistoryworks with database triggers these are created as part of a migration regardless of which audit log is configured.django-auditlogdepending on the chosen auditlog type in settings.Performance
I did a couple of test runs to compare both audit log types using the JFrog Unified Very Many sample scan that contains ~14000 findings. These tests runs show a 20-30% speed improvement on my laptop running docker compose. In a (production) environment with an external database the difference might be bigger due to the increased latency.
django-pghistoryruns inside the database so there are a lot less network roundtrips needed.Scaling
djang-pghistorydocuments two settings or design decisions that affect performance: https://django-pghistory.readthedocs.io/en/3.4.3/performance/Row level vs Statement level triggers.
Defect Dojo doesn't do (m)any bulk inserts/updates on the tracked models. So there's no benefit now to switch to Statement level triggers.
If needed we can switch in the future using a one-time schema-only migration.
Denormalizing Context
Context is stored in its own table and events have a foreign key relation to this table. If the Context is intensively used and updated during processes/requests and lots of parallel requests are happening that trigger pghistory events, there could be some contention on that Context table. To "solve" this we could choose to store the context in a column in the Event tables. I don't think the typical use-case in Defect Dojo would really benefit from this, it could be detrimental as it would require more storage and filtering/extracting data from the context might be slower.
If needed we can switch in the future using a one-time schema+data migration.
Indices
By default
djang-pghistorydoesn't create much indices, only a couple of ones we don't need. Changes made:Remove indices on foreign key fields in the event tables (i.e. user_id, found_by, etc). We don't query on those.
Add index on fields that we query on like
pgh_created,pgh_label.Add index on
obj_idfield used in joins/queriesAdd index on fields from the json context, i.e.
user,remote_addr,urlNext steps
When testing is succesfull and we want to make
django-pghistorythe default, we need to do the following in a follow up PRinitial_importrecorddjango-auditlogdjango-auditlogfrom MIDDLEWAREdjango-auditloginstalled apps (not sure, it might need to remain to be able to load LogEntry model)